Step 2: provenance hash chain made tamper-evident (V-L1-B1 + V-L2-N1/C1-4/L1-2) by hyperpolymath · Pull Request #33 · hyperpolymath/verisimiser

hyperpolymath · 2026-05-13T01:01:17Z

Summary

Step 2 of the bottom-up plan — makes the Provenance octad concern actually tamper-evident. One commit, eight issues, ~440 LOC of doc + code + tests.

Depends on #24 (Step 0+1) — this branch is stacked on step0-1-ground-clearing. Merge #24 first; this PR will auto-rebase onto main.

What changes

Doc (foundational)

docs/theory/provenance-threat-model.adoc — four-adversary model (read-only / sidecar-write / sidecar-rewrite / clock-skew), per-adversary protection matrix, field-coverage + canonical-encoding + append-serialisation requirements, anchor/notary future work, open questions. Every Step 2 issue cites a section. Closes V-L1-B1: write provenance threat model (foundational doc for Step 2) #25.

Code

src/abi/mod.rs — ProvenanceEntry::compute_hash rewritten with domain separation (b"verisim-prov-v1\0"), length-prefixed encoding for all variable-length fields, canonical timestamp (i64_le(secs) || u32_le(nanos)), and full field coverage (previous_hash, entity_id, operation, actor, timestamp, before_snapshot, transformation). verify(), genesis(), chain() updated to match. Closes V-L2-C1: hash full audit (actor + before_snapshot + transformation) with domain separation #27, V-L2-C2: hash timestamp as i64+u32 canonical, not RFC3339 string #28.
src/tier1/provenance.rs — orphan ProvenanceRecord deleted; file now re-exports abi::ProvenanceEntry and is the future home of V-L1-C1's append_provenance write-path helper. Closes V-L2-N1: deduplicate ProvenanceRecord vs ProvenanceEntry (do first) #26.
src/codegen/overlay.rs — verisimdb_provenance_log gains UNIQUE INDEX ux_provenance_chain ON (entity_id, previous_hash) (V-L2-L2 — makes chain forks structurally impossible, also enforces exactly one genesis per entity since genesis carries previous_hash=''). New verisimdb_provenance_chain_head table emitted alongside (V-L2-L1 — per-entity head pointer for write-path lock). Closes V-L2-L1: per-entity serialisation prevents chain forks (write-path lock) #31, V-L2-L2: UNIQUE INDEX(entity_id, previous_hash) makes forks structurally impossible #32.
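To illustrate why the UNIQUE INDEX on (entity_id, previous_hash) rules out forks and duplicate genesis rows, here is a minimal in-memory sketch. It is not the generated DDL or the future write path: `ProvenanceLog`, `AppendError`, `append`, and `head` are hypothetical names, and a `HashSet` stands in for the index.

```rust
use std::collections::{HashMap, HashSet};

/// In-memory stand-in for verisimdb_provenance_log plus
/// verisimdb_provenance_chain_head (hypothetical model, not the real schema).
#[derive(Default)]
pub struct ProvenanceLog {
    /// Mirrors UNIQUE INDEX ux_provenance_chain ON (entity_id, previous_hash).
    chain_index: HashSet<(String, String)>,
    /// Mirrors the per-entity head pointer (entity_id -> head_hash).
    chain_head: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum AppendError {
    /// Another row already claims this (entity_id, previous_hash) pair.
    UniqueViolation,
}

impl ProvenanceLog {
    /// Append an entry. Genesis entries pass previous_hash = "".
    pub fn append(
        &mut self,
        entity_id: &str,
        previous_hash: &str,
        new_hash: &str,
    ) -> Result<(), AppendError> {
        let key = (entity_id.to_string(), previous_hash.to_string());
        // HashSet::insert returns false when the key is already present,
        // which plays the role of the database rejecting the INSERT.
        if !self.chain_index.insert(key) {
            return Err(AppendError::UniqueViolation);
        }
        self.chain_head
            .insert(entity_id.to_string(), new_hash.to_string());
        Ok(())
    }

    /// Current head hash for an entity, if any entry has been appended.
    pub fn head(&self, entity_id: &str) -> Option<&str> {
        self.chain_head.get(entity_id).map(String::as_str)
    }
}
```

A second genesis for the same entity, or a second child of the same parent, fails with `UniqueViolation`. The sketch deliberately does not check that `previous_hash` equals the current head; that check belongs to the write-path lock deferred to V-L1-C1.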

Tests

src/abi/mod.rs::tests — 8 new unit tests: tamper-detection for each of (entity_id, actor, before_snapshot, transformation, operation, previous_hash), a canonical-timestamp round-trip, and a 4-entry chain mutation-matrix that asserts every field mutation on every entry breaks verify(). 26 → 35 lib tests. Closes V-L2-C3: positive tamper-detection tests for actor / before_snapshot / transformation #29.
tests/integration_test.rs::test_provenance_chain_integrity_multi_step — the assertion that codified the bug ("Actor is not part of hash — tamper to actor alone is invisible") is replaced with the inverse: tampering with actor and with before_snapshot both break verify(). Closes V-L2-C4: remove the wontfix tamper-evidence test (it asserts the bug) #30.
src/codegen/overlay.rs::tests — 2 new DDL tests confirming the UNIQUE INDEX and chain_head table are emitted.
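The mutation-matrix idea can be sketched on a toy chain. This is an illustration only, not the tests in src/abi/mod.rs: `ToyEntry`, `toy_hash`, `build_chain`, `tamper`, and `undetected_mutations` are hypothetical, only four string fields are modelled, and the standard library's `DefaultHasher` stands in for the real digest (it is not cryptographic).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

#[derive(Clone)]
pub struct ToyEntry {
    pub previous_hash: String,
    pub entity_id: String,
    pub operation: String,
    pub actor: String,
    pub hash: String,
}

/// Stand-in digest over length-prefixed fields. NOT cryptographic.
fn toy_hash(fields: [&str; 4]) -> String {
    let mut h = DefaultHasher::new();
    for f in fields {
        h.write(&(f.len() as u64).to_le_bytes());
        h.write(f.as_bytes());
    }
    format!("{:016x}", h.finish())
}

fn entry_hash(e: &ToyEntry) -> String {
    toy_hash([
        e.previous_hash.as_str(),
        e.entity_id.as_str(),
        e.operation.as_str(),
        e.actor.as_str(),
    ])
}

/// Build an n-entry chain; the genesis entry carries previous_hash = "".
pub fn build_chain(n: usize) -> Vec<ToyEntry> {
    let mut chain: Vec<ToyEntry> = Vec::new();
    for i in 0..n {
        let mut e = ToyEntry {
            previous_hash: chain.last().map(|p| p.hash.clone()).unwrap_or_default(),
            entity_id: "entity-1".to_string(),
            operation: format!("op-{}", i),
            actor: "alice".to_string(),
            hash: String::new(),
        };
        e.hash = entry_hash(&e);
        chain.push(e);
    }
    chain
}

/// Recompute every hash and check every link.
pub fn verify(chain: &[ToyEntry]) -> bool {
    let mut expected_prev = String::new();
    for e in chain {
        if e.previous_hash != expected_prev || e.hash != entry_hash(e) {
            return false;
        }
        expected_prev = e.hash.clone();
    }
    true
}

/// Mutate one hash-covered field without recomputing the stored hash.
fn tamper(e: &mut ToyEntry, field: usize) {
    match field {
        0 => e.previous_hash.push('x'),
        1 => e.entity_id.push('x'),
        2 => e.operation.push('x'),
        _ => e.actor.push('x'),
    }
}

/// Count (entry, field) mutations that verify() fails to detect.
pub fn undetected_mutations(chain: &[ToyEntry]) -> usize {
    let mut missed = 0;
    for i in 0..chain.len() {
        for field in 0..4 {
            let mut tampered = chain.to_vec();
            tamper(&mut tampered[i], field);
            if verify(&tampered) {
                missed += 1;
            }
        }
    }
    missed
}
```

For a 4-entry chain the matrix covers 16 mutations, and a sound `verify()` should leave `undetected_mutations` at zero. The sketch only models an adversary who edits fields in place; one who also recomputes downstream hashes is the sidecar-rewrite case covered by the threat model.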

Doc bookkeeping

docs/architecture/TOPOLOGY.md — line for tier1/provenance.rs updated to reflect the re-export.

Test plan

cargo fmt --all -- --check clean
cargo clippy --all-targets -- -D warnings clean
cargo test reports 35 lib + 9 integration = 44 tests, 0 failed
CI green on this PR once #24 (Step 0+1 ground-clearing: ADRs, deletions, test fix, lint cleanup, README/ROADMAP, CI concurrency) merges and the base is rebased
V-L1-C1 (write-path append_provenance against the new chain_head table) — follow-up, not in scope here

Threat-model note

The threat model is explicit that the chain protects only what's in the preimage, and only against read-only and append-only sidecar-write adversaries without external anchoring. Against a sidecar-rewrite adversary (root on the sidecar host, restore from backup) only the prefix up to the most-recent externally-attested hash is protected — and verisimiser has no notary integration yet (deferred to ADR-0005). The README's "tamper-evident" framing is now formally bounded.
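As a rough illustration of the "protected prefix" bound, the sketch below computes how many leading entries are covered once one entry hash has been attested externally. Nothing like this exists in verisimiser yet: `protected_prefix_len` is a hypothetical helper, and how a hash gets attested is left open (deferred to ADR-0005).

```rust
/// Number of leading chain entries covered by an external attestation:
/// everything up to and including the entry whose hash was attested.
/// Returns None when the attested hash is not in the chain at all.
/// (Hypothetical helper; assumes entry hashes are listed in chain order.)
pub fn protected_prefix_len(entry_hashes: &[&str], attested_hash: &str) -> Option<usize> {
    entry_hashes
        .iter()
        .position(|h| *h == attested_hash)
        .map(|i| i + 1)
}
```

Entries after the attested one remain exposed to a sidecar-rewrite adversary, who can rewrite them and recompute their hashes consistently.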

🤖 Generated with Claude Code

…-L2-L1..L2

Step 2 of the bottom-up plan. Brings the Provenance octad concern up to the claim made in the README: tampering with any audit-relevant field in a logged entry breaks `verify()`.

V-L1-B1 - docs/theory/provenance-threat-model.adoc:
Four-adversary model (R / SW / SR / SR+CK), per-adversary protection matrix, the field-coverage and canonical-encoding requirements that bind V-L2-C1 + V-L2-C2, the append-serialisation requirement that binds V-L2-L1 + V-L2-L2, anchor/notary future work, open questions (None vs Some(""), chain_id). Each Step 2 issue cites a section.

V-L2-N1 - deduplicate ProvenanceRecord vs ProvenanceEntry:
Delete src/tier1/provenance.rs::ProvenanceRecord (orphan duplicate of abi::ProvenanceEntry with its own compute_hash that risked drifting). tier1/provenance.rs now re-exports the canonical type; the file is the future home of V-L1-C1's write-path helpers (sqlite3_update_hook → append_provenance). TOPOLOGY.md updated.

V-L2-C1 - full-field, domain-separated hash:
compute_hash signature changes from (4 strs) to (5 strs + DateTime + 2 Options). New preimage = domain tag b"verisim-prov-v1\0" || length-prefixed (previous_hash, entity_id, operation, actor) || canonical timestamp (V-L2-C2) || length-prefixed (before_snapshot, transformation). All seven fields participate. PROV_DOMAIN_TAG versioning is reserved for a future SHA-256→? migration. verify(), genesis(), chain() all pass the full field set.

V-L2-C2 - canonical timestamp:
Replace timestamp.to_rfc3339() (multiple valid forms per instant) with i64_le(timestamp()) || u32_le(timestamp_subsec_nanos()), 12 bytes total. Round-trip unit test asserts two construction paths that yield the same instant produce the same hash.

V-L2-C3 - positive tamper-detection tests:
Eight new unit tests in abi::tests covering each hash-covered field (entity_id, actor, before_snapshot, transformation, operation, previous_hash, timestamp) plus the canonical-encoding property test plus a 4-entry chain mutation-matrix that asserts every field mutation on every entry breaks verify(). 9 new test cases (26 → 35 lib tests).

V-L2-C4 - flip the wontfix test:
tests/integration_test.rs::test_provenance_chain_integrity_multi_step previously codified the bug ("Actor is not part of hash — tamper to actor alone is invisible"). Replaced with assertions that tampering with actor and with before_snapshot both break verify().

V-L2-L1 - chain_head table + write-path serialisation spec:
codegen/overlay.rs emits a new verisimdb_provenance_chain_head (entity_id PK, head_hash, updated_at) alongside the provenance log. The write-path lock (SELECT … FOR UPDATE / BEGIN IMMEDIATE on the head row, INSERT into log, UPDATE head, COMMIT) is specified in the threat-model doc and the table-generator docstring. The library function that performs the transaction is V-L1-C1's job; V-L2-L1 only lands the schema.

V-L2-L2 - UNIQUE INDEX makes forks unrepresentable:
CREATE UNIQUE INDEX IF NOT EXISTS ux_provenance_chain ON verisimdb_provenance_log(entity_id, previous_hash). Genesis rows all carry previous_hash='' so the same constraint enforces exactly one genesis per entity. Two new DDL tests assert presence of both the UNIQUE INDEX and the chain_head table.

Verified locally:
- cargo fmt --all -- --check clean
- cargo clippy --all-targets -- -D warnings clean
- cargo test reports 35 + 9 = 44 tests, 0 failed

Closes #25, #26, #27, #28, #29, #30, #31, #32

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
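The V-L2-C1 / V-L2-C2 preimage layout in the commit message above can be sketched with the standard library alone. This is a hedged illustration, not the code in src/abi/mod.rs: `PreimageFields`, `put_bytes`, `put_opt`, and `build_preimage` are hypothetical names, the u64 little-endian length prefix is an assumption (the commit says only "length-prefixed"), and the presence byte for optional fields is one possible answer to the open None vs Some("") question. The timestamp is passed as raw seconds and nanoseconds to avoid a chrono dependency.

```rust
/// Domain tag quoted in the commit message; the trailing NUL is part of it.
pub const PROV_DOMAIN_TAG: &[u8] = b"verisim-prov-v1\0";

#[derive(Clone, Copy)]
pub struct PreimageFields<'a> {
    pub previous_hash: &'a str,
    pub entity_id: &'a str,
    pub operation: &'a str,
    pub actor: &'a str,
    /// Seconds since the Unix epoch (what timestamp() would return).
    pub secs: i64,
    /// Sub-second nanoseconds (what timestamp_subsec_nanos() would return).
    pub nanos: u32,
    pub before_snapshot: Option<&'a str>,
    pub transformation: Option<&'a str>,
}

/// Append u64_le(len) || bytes. The u64 width is an assumption.
fn put_bytes(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u64).to_le_bytes());
    out.extend_from_slice(field);
}

/// Append an optional field behind a presence byte so that None and
/// Some("") encode differently (assumed encoding, not confirmed by the PR).
fn put_opt(out: &mut Vec<u8>, field: Option<&str>) {
    match field {
        None => out.push(0),
        Some(s) => {
            out.push(1);
            put_bytes(out, s.as_bytes());
        }
    }
}

/// Build the byte string that would be fed to the hash function.
pub fn build_preimage(f: &PreimageFields<'_>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(PROV_DOMAIN_TAG);
    for s in [f.previous_hash, f.entity_id, f.operation, f.actor] {
        put_bytes(&mut out, s.as_bytes());
    }
    // Canonical timestamp: i64_le(secs) || u32_le(nanos), 12 bytes total.
    out.extend_from_slice(&f.secs.to_le_bytes());
    out.extend_from_slice(&f.nanos.to_le_bytes());
    put_opt(&mut out, f.before_snapshot);
    put_opt(&mut out, f.transformation);
    out
}
```

Length prefixes keep field boundaries unambiguous: (entity_id = "ab", operation = "c") and (entity_id = "a", operation = "bc") yield different preimages, which plain concatenation would not guarantee. The fixed-width timestamp gives one encoding per instant, unlike an RFC 3339 string.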

This was referenced May 13, 2026

Step 3: manifest semantics (V-L2-D1 explicit Constraints + V-L2-E1 conflict-detection + V-L2-O1 init flags) #37

Closed

Step 4 (partial): DDL hardening — V-L2-G1/H1/H2/I1/J1/K1 #67

Merged

hyperpolymath merged commit 7a4ccfd into step0-1-ground-clearing May 13, 2026
15 checks passed

hyperpolymath deleted the step2-hash-chain branch May 13, 2026 01:18
